Monaural segregation of voiced speech using discriminative random fields
نویسندگان
چکیده
Techniques for separating speech from background noise and other sources of interference have important applications for robust speech recognition and speech enhancement. Many traditional computational auditory scene analysis (CASA) based approaches decompose the input mixture into a time-frequency (T-F) representation, and attempt to identify the T-F units where the target energy dominates that of the interference. This is accomplished using a two stage process of segmentation and grouping. In this pilot study, we explore the use of Discriminative Random Fields (DRFs) for the task of monaural speech segregation. We find that the use of DRFs allows us to effectively combine multiple auditory features into the system, while simultaneously integrating the the two CASA stages into one. Our preliminary results suggest that CASA based approaches may benefit from the DRF framework.
منابع مشابه
Preliminary intelligibility tests of a monaural speech segregation system
Human listeners are able to understand speech in the presence of a noisy background. How to simulate this perceptual ability remains a great challenge. This paper describes a preliminary evaluation of intelligibility of the output of a monaural speech segregation system. The system performs speech segregation in two stages. The first stage segregates voiced speech using supervised learning of h...
متن کاملMonaural Voiced Speech Segregation Based on Pitch and Comb Filter
The correlogram is an important mid-level representation for periodic sounds which is widely used in sound source separation and pitch detection. However, it is very time consuming. In this paper, we presented a novel scheme for monaural voiced speech separation without computing correlograms. The noisy speech is firstly decomposing into time-frequency units. Pitch contour of the target speech ...
متن کاملSegregation of unvoiced speech from nonspeech interference.
Monaural speech segregation has proven to be extremely challenging. While efforts in computational auditory scene analysis have led to considerable progress in voiced speech segregation, little attention has been given to unvoiced speech, which lacks harmonic structure and has weaker energy, hence more susceptible to interference. This study proposes a new approach to the problem of segregating...
متن کاملAn Auditory Scene Analysis Approach to Monaural Speech Segregation
A human listener has the remarkable ability to segregate an acoustic mixture and attend to a target sound. This perceptual process is called auditory scene analysis (ASA). Moreover, the listener can accomplish much of auditory scene analysis with only one ear. Research in ASA has inspired many studies in computational auditory scene analysis (CASA) for sound segregation. In this chapter we intr...
متن کاملMonaural Speech Segregation Based on Pitch
Introduction The goal of the proposed algorithm is to separate speech signals in monaural recordings even in very adverse conditions when significant background noise and additional speakers are present at the same time. Particularly we try to decide for each time frequency region which of the different sound sources dominates and then build for each sound source a binary mask which is one at t...
متن کامل